van hoof
GLOP: Learning Global Partition and Local Construction for Solving Large-scale Routing Problems in Real-time
Ye, Haoran, Wang, Jiarui, Liang, Helan, Cao, Zhiguang, Li, Yong, Li, Fanzhang
The recent end-to-end neural solvers have shown promise for small-scale routing problems but suffered from limited real-time scaling-up performance. This paper proposes GLOP (Global and Local Optimization Policies), a unified hierarchical framework that efficiently scales toward large-scale routing problems. GLOP partitions large routing problems into Travelling Salesman Problems (TSPs) and TSPs into Shortest Hamiltonian Path Problems. For the first time, we hybridize non-autoregressive neural heuristics for coarse-grained problem partitions and autoregressive neural heuristics for fine-grained route constructions, leveraging the scalability of the former and the meticulousness of the latter. Experimental results show that GLOP achieves competitive and state-of-the-art real-time performance on large-scale routing problems, including TSP, ATSP, CVRP, and PCTSP.
BO-Muse: A human expert and AI teaming framework for accelerated experimental design
Gupta, Sunil, Shilton, Alistair, A, Arun Kumar V, Ryan, Shannon, Abdolshah, Majid, Le, Hung, Rana, Santu, Berk, Julian, Rashid, Mahad, Venkatesh, Svetha
Bayesian Optimization (BO) (Shahriari et al., 2015) is a popular sample-efficient optimization technique to solve problems where the objective is expensive. It has been successfully applied in diverse areas (Greenhill et al., 2020) including material discovery (Li et al., 2017), alloy design (Barnett et al., 2020) and molecular design (Gómez-Bombarelli et al., 2018). However, standard BO typically operates tabula rasa, building its model of the objective from minimal priors that do not include domain-specific information. While there has been some progress made incorporating domain-specific knowledge to accelerate BO (Li et al., 2018; Hvarfner et al., 2022) or transfer learning from previous experiments (Shilton et al., 2017), it remains the case that there is a significant corpus of knowledge and expertise that could potentially accelerate BO even further but which remain largely untapped due to the inherent complexities involved in knowledge extraction and exploitation. In particular, this often arises from the fact that experts tend to organize their knowledge in complex schema containing concepts, attributes and relationships (Rousseau, 2001), making the elicitation of relevant expert knowledge, both quantitative and qualitative, a difficult task.
Combining Reinforcement Learning and Optimal Transport for the Traveling Salesman Problem
Goh, Yong Liang, Lee, Wee Sun, Bresson, Xavier, Laurent, Thomas, Lim, Nicholas
The traveling salesman problem is a fundamental combinatorial optimization problem with strong exact algorithms. However, as problems scale up, these exact algorithms fail to provide a solution in a reasonable time. To resolve this, current works look at utilizing deep learning to construct reasonable solutions. Such efforts have been very successful, but tend to be slow and compute intensive. This paper exemplifies the integration of entropic regularized optimal transport techniques as a layer in a deep reinforcement learning network. We show that we can construct a model capable of learning without supervision and inferences significantly faster than current autoregressive approaches. We also empirically evaluate the benefits of including optimal transport algorithms within deep learning models to enforce assignment constraints during end-to-end training.
Neural combinatorial optimization beyond the TSP: Existing architectures under-represent graph structure
Boffa, Matteo, Houidi, Zied Ben, Krolikowski, Jonatan, Rossi, Dario
Recent years have witnessed the promise that reinforcement learning, coupled with Graph Neural Network (GNN) architectures, could learn to solve hard combinatorial optimization problems: given raw input data and an evaluator to guide the process, the idea is to automatically learn a policy able to return feasible and high-quality outputs. Recent work have shown promising results but the latter were mainly evaluated on the travelling salesman problem (TSP) and similar abstract variants such as Split Delivery Vehicle Routing Problem (SDVRP). In this paper, we analyze how and whether recent neural architectures can be applied to graph problems of practical importance. We thus set out to systematically "transfer" these architectures to the Power and Channel Allocation Problem (PCAP), which has practical relevance for, e.g., radio resource allocation in wireless networks. Our experimental results suggest that existing architectures (i) are still incapable of capturing graph structural features and (ii) are not suitable for problems where the actions on the graph change the graph attributes. On a positive note, we show that augmenting the structural representation of problems with Distance Encoding is a promising step towards the still-ambitious goal of learning multi-purpose autonomous solvers.
Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle Routing Problems
Xin, Liang, Song, Wen, Cao, Zhiguang, Zhang, Jie
We present a novel deep reinforcement learning method to learn construction heuristics for vehicle routing problems. In specific, we propose a Multi-Decoder Attention Model (MDAM) to train multiple diverse policies, which effectively increases the chance of finding good solutions compared with existing methods that train only one policy. A customized beam search strategy is designed to fully exploit the diversity of MDAM. In addition, we propose an Embedding Glimpse layer in MDAM based on the recursive nature of construction, which can improve the quality of each policy by providing more informative embeddings. Extensive experiments on six different routing problems show that our method significantly outperforms the state-of-the-art deep learning based models.
Learning Improvement Heuristics for Solving the Travelling Salesman Problem
Wu, Yaoxin, Song, Wen, Cao, Zhiguang, Zhang, Jie, Lim, Andrew
Recent studies in using deep learning to solve the Travelling Salesman Problem (TSP) focus on construction heuristics, the solution of which may still be far from optimal-ity. To improve solution quality, additional procedures such as sampling or beam search are required. However, they are still based on the same construction policy, which is less effective in refining a solution. In this paper, we propose to directly learn the improvement heuristics for solving TSP based on deep reinforcement learning. We first present a reinforcement learning formulation for the improvement heuristic, where the policy guides selection of the next solution. Then, we propose a deep architecture as the policy network based on self-attention. Extensive experiments show that, improvement policies learned by our approach yield better results than state-of-the-art methods, even from random initial solutions. Moreover, the learned policies are more effective than the traditional handcrafted ones, and robust to different initial solutions with either high or poor quality. 1 Introduction The Travelling Salesman Problem (TSP) is a typical combinatorial optimization problem that has extensive applications in the real world. The problem statement is straightforward: given a set of locations, find the salesman a shortest tour that traverses each location exactly once and returns to the original one. Although having been widely studied for decades, achieving satisfactory performance is still challenging due to its NPhard complexity.
State Representation Learning for Control: An Overview
Lesort, Timothée, Díaz-Rodríguez, Natalia, Goudou, Jean-François, Filliat, David
Representation learning algorithms are designed to learn abstract features that characterize data. State representation learning (SRL) focuses on a particular kind of representation learning where learned features are in low dimension, evolve through time, and are influenced by actions of an agent. As the representation learned captures the variation in the environment generated by agents, this kind of representation is particularly suitable for robotics and control scenarios. In particular, the low dimension helps to overcome the curse of dimensionality, provides easier interpretation and utilization by humans and can help improve performance and speed in policy learning algorithms such as reinforcement learning. This survey aims at covering the state-of-the-art on state representation learning in the most recent years. It reviews different SRL methods that involve interaction with the environment, their implementations and their applications in robotics control tasks (simulated or real). In particular, it highlights how generic learning objectives are differently exploited in the reviewed algorithms. Finally, it discusses evaluation methods to assess the representation learned and summarizes current and future lines of research.